Abstract

Background: Like power, the type I error of cluster detection tests (CDTs) should be spatially
assessed. Indeed, both the type I error and the power of CDTs have a spatial component, as CDTs
both detect and locate clusters. In the case of type I error, the spatial distribution of wrongly
detected clusters (WDCs) can be particularly affected by edge effect. This simulation study aims
to describe the spatial distribution of WDCs and to confirm and quantify the presence of
edge effect.

Methods: We simulated 40 000 datasets under the null hypothesis of risk homogeneity. The
simulation design used realistic parameters from survey data on birth defects, in particular two
baseline risks. The simulated datasets were analyzed using Kulldorff's spatial scan statistic, a
commonly used test whose behavior is otherwise well known. To describe the spatial distribution
of type I error, we defined the participation rate for each spatial unit of the region. We used
this indicator in a new statistical test proposed to confirm, as well as quantify, the edge effect.

Results: The predefined type I error of 5% was respected for both baseline risks. Results showed
a strong edge effect in participation rates, with a descending gradient from center to edge, and
WDCs more often centrally situated.

Conclusions: In routine analysis of real data, clusters on the edge of the region should be
carefully considered, as they rarely occur when there is no cluster. Further work is needed to
combine results from power studies with this work in order to optimize CDT performance.
Résumé

Background: Cluster detection tests (CDTs) both detect and locate clusters. As with power, the
spatial distribution of their type I error therefore needs to be studied. For type I error, the
spatial distribution of wrongly detected clusters (WDCs) can be particularly affected by an edge
effect. This simulation study aims to describe the spatial distribution of WDCs and to confirm
and quantify the presence of this edge effect.

Methods: This work is based on 40 000 datasets simulated under the null hypothesis of spatial
homogeneity of risk. The simulations were based on real parameters from a birth defects registry,
in particular two real baseline risks. Describing the spatial distribution of type I error led us
to define the participation rate of each spatial unit of the region. This indicator was then used
to construct a new statistical test intended to confirm and quantify the edge effect.

Results: The overall type I error of 5% was recovered. The results showed a very clear edge
effect, with the participation rate decreasing from the center to the edge, WDCs being more
often located in the central zone.

Conclusions: When CDTs are applied to real data, clusters detected near the edge of a study
region should be examined with the greatest attention, as such detections are very rare in the
absence of a real cluster. Future developments should now combine these results with those of
power studies in order to optimize CDT performance.



Background 

Spatial clusters can be detected using a wide range of statistical tests [1,2], many of which
are available in free software such as R [3,4]. Epidemiologists use cluster detection tests
(CDTs) to detect clusters without a priori knowledge of either their number or their location,
and to determine their significance. Because CDT performance depends on the epidemiological and
geographical context [1,5-11], power studies are recommended before using these tests in a
particular region for a given phenomenon. However, statistical power is not the only test
characteristic determining performance. Overall performance depends on two types of risk:
type I and type II errors.

In the presence of clusters, the usual statistical power ($1-\beta$), i.e. the probability of
rejecting the null hypothesis of risk homogeneity, is not sufficient to assess CDT performance.
At worst, a CDT could have maximum power to reject this null hypothesis but never correctly
locate the true cluster. A similar concern can be raised for type I error. Under the null
hypothesis of no cluster, a CDT could generate wrongly detected clusters (WDCs) preferentially
localized in particular zones of the studied region. The overall type I error could indeed be
equal to its predefined value, usually set to 5%, but the interpretation of the analyses would
certainly not be the same for detected clusters inside or outside such zones.

In the case of statistical power, authors have used either separate evaluation of power and
location by different indicators [6,12-14] or concomitant evaluation of both with a single
measure such as the extended power [15,16]. The development of a single measure of performance
taking into account both power and location accuracy has enabled systematic spatial evaluation
of performance over entire regions [15]. The question of the spatial evaluation of CDTs is, so
far, not fully answered with regard to power, because factors such as relative risk or cluster
shape and size are still assessed by a non-systematic approach based on more or less arbitrary
settings in simulation designs.

The question of relative risks and clustering characteristics is not relevant in the spatial
evaluation of type I error. Other factors, however, have to be taken into account. First, there
is still one epidemiological factor that requires setting: the baseline risk. For applied
purposes, the baseline incidence of the studied disease is the evident choice, but for research,
a systematic evaluation over a wide range of this factor should be carried out. Second,
simulation studies evaluating type I error are much more likely to be influenced by edge effect
[17-19] than power studies. Indeed, in the majority of simulation studies assessing power, edge
effect is largely lessened by designs simulating clusters wholly within the studied region.

We aimed to evaluate CDTs regarding the spatial distribution of type I error. This description
was carried out at the level of the spatial unit (SU) by introducing the concept of SU
participation rate. Because edge effect was of particular interest, we proposed a statistic to
quantify and test for it. We used Kulldorff's spatial scan statistic as an example of a CDT
whose behavior is otherwise well known, and performed a simulation study using realistic
parameters from survey data on birth defects.

Methods 

Disease modeling 

The study region was the Auvergne region (France), divided into n = 221 spatial units (SUs)
equivalent to U.S. ZIP codes. We applied two baseline risks (incidences) of birth defects to the
same at-risk population, whose size was approximated by the mean annual number of live births.

For a realistic analysis, we used data archived in the CEMC (birth defects registry for the
Auvergne region) and INSEE (National Institute of Statistics and Economic Studies) databases.
We collected two categories of data from 1999 to 2006: all birth defects and cardiovascular
birth defects. Both datasets were sorted by SU. The number of live births was approximated by
the number of birth declarations in the at-risk population. Global annual incidences of all
birth defects ($I_{all}$) and cardiovascular birth defects ($I_{cv}$) were estimated at 2.26%
and 0.48% of births, respectively.

Datasets 

We generated 20 000 datasets for each baseline risk, i.e. 
a total of 40 000 datasets. 

Each dataset is entered as a table of 221 rows and 5 columns. Each row contains the coordinates
(longitude and latitude) of an SU, the observed number of cases, the size of the at-risk
population (i.e., the number of live births) and the expected number of cases in that SU. This
last quantity is the product of the global incidence ($I_{all}$ or $I_{cv}$) and the at-risk
population size in the SU. The observed case numbers are assumed to be independent Poisson
variables such that

$$E(N_i) = \mu_i, \qquad N_i \sim \mathrm{Pois}(\mu_i), \qquad i = 1, \ldots, n$$

where $N_i$ is the observed number of cases and $\mu_i$ denotes the expected number of cases in
the $i$th SU under the null hypothesis of risk homogeneity.
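The generation of one dataset under the null hypothesis can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study; the population sizes, the seed and the helper names `sample_poisson` and `simulate_dataset` are hypothetical, and the Poisson sampler is Knuth's simple method, which is only suitable for small expected counts.

```python
import math
import random


def sample_poisson(mu, rng):
    """Draw one Poisson(mu) variate with Knuth's method (adequate for small mu)."""
    threshold = math.exp(-mu)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_dataset(population, incidence, rng):
    """Return (expected, observed) case counts per SU under risk homogeneity."""
    # Expected count mu_i = global incidence x at-risk population of SU i.
    expected = [incidence * size for size in population]
    # Observed counts are independent Poisson draws with mean mu_i.
    observed = [sample_poisson(mu, rng) for mu in expected]
    return expected, observed


rng = random.Random(2013)            # arbitrary seed, for reproducibility only
population = [12, 30, 55, 140, 8]    # hypothetical live births per SU, not the Auvergne data
expected, observed = simulate_dataset(population, 0.0226, rng)
```

Repeating the call 20 000 times per baseline risk would reproduce the structure of the simulation design, with one observed and one expected count per SU in each dataset.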




Figure 1 Representation of the medial axis of the Auvergne region.

Assessment of type I error
Overall rate 

The global type I error rate was estimated by the proportion 
of WDC over the 20 000 datasets for each baseline risk. 

Spatial distribution 

SU participation rate: The participation rate of each WDC in the overall type I error is equal
to $1/m$, with $m$ the number of WDCs. The participation rate of each SU in the overall type I
error was estimated by a weighted sum of the number of times that SU was included in a WDC. The
weight is a function of $m$ and of the length of each WDC (the number of SUs it contains). For
each SU $i$ among the $n$ SUs of the region, the participation rate $P_i$ in the overall type I
error is such that

$$P_i = \sum_{j=1}^{m} I_{ij} \, (m \, l_j)^{-1}$$

where $m$ is the number of WDCs, $l_j$ is the length of the $j$th WDC and $I_{ij}$ is a binary
indicator equal to 1 when the $i$th SU is within the $j$th WDC and 0 otherwise. By construction,
$P_i \geq 0$ and $\sum_{i=1}^{n} P_i = 1$, where $n$ is the number of SUs in the region.
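The participation rate can be illustrated with a short Python sketch. The function name `participation_rates` and the three toy WDCs are hypothetical; each WDC is represented as the set of indices of the SUs it contains.

```python
def participation_rates(wdcs, n_units):
    """Compute P_i = sum_j I_ij / (m * l_j) for WDCs given as sets of SU indices."""
    m = len(wdcs)
    if m == 0:
        raise ValueError("at least one wrongly detected cluster is required")
    rates = [0.0] * n_units
    for cluster in wdcs:
        # Each WDC carries a total weight 1/m, shared equally among its l_j SUs.
        weight = 1.0 / (m * len(cluster))
        for i in cluster:
            rates[i] += weight
    return rates


wdcs = [{0, 1}, {1}, {1, 2, 3}]      # hypothetical WDCs over a region of 5 SUs
rates = participation_rates(wdcs, n_units=5)
```

In this toy case SU 1 belongs to all three WDCs and receives 1/6 + 1/3 + 1/9 = 11/18, SU 4 belongs to none and receives 0, and the rates sum to 1 as required by construction.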

Edge effect: The edge effect is defined here as an inhomogeneous distribution of $P_i$
characterized by a gradient from the medial axis (or cut locus or skeleton) of the region to its
edge. This gradient can be either ascending or descending. The medial axis is the set of all
points having more than one closest point on the region's edge [20-23]. Figure 1 shows the
medial axis of the region under study (see Endnotes). For such a simple polygon, the medial axis
is a tree whose leaves are the vertices and whose edges are straight segments reflecting local
symmetries of the shape.

To confirm the presence of an edge effect, we propose a test whose statistic, referred to as
$E$, is such that

$$E = \sum_{i=1}^{n} \varepsilon_i \left(P_i - n^{-1}\right), \qquad \varepsilon_i = 1 - \frac{2 d_i}{D}$$

where $d_i$ is the minimal Euclidean distance between the centroid of the $i$th SU and the edge
of the region, $D$ is the maximum Euclidean distance between any point of the medial axis and
the closest edge of the region, and $n$ is the number of SUs in the region. By construction, as
$0 \leq d_i \leq D$, we have $-1 \leq \varepsilon_i \leq +1$. The coefficient $\varepsilon_i$ is
a continuous indicator quantifying how much a point can be considered "on the edge" of the
region. It is referred to as "the edge coefficient" in the remainder of this paper. For any
point in the region, the closer to the edge, the higher the edge coefficient, and the closer to
the medial axis, the smaller the edge coefficient. The edge coefficient ranges from -1 for the
most "central/medial" points of the region to +1 for points on the edge. For a study region
divided into census tracts, each SU is attributed the edge coefficient of its centroid, so that
all SUs with the same edge coefficient are at the same distance from the edge.
The null hypothesis of the test is expressed by

$$H_0 : E = 0$$

The quantity $n^{-1}$ is the expected participation rate for all SUs under the null hypothesis
of spatial homogeneity in type I error. When $P_i$ is higher than expected towards the edge of
the region, it is, by construction, lower towards the center (as $\sum_{i=1}^{n} P_i = 1$) and
there is an ascending gradient. On the contrary, when $P_i$ is higher towards the center of the
region, there is a descending gradient. The statistic $E$ is positive when there is an ascending
gradient of $P_i$ and negative when the gradient is descending. Indeed, in case of an ascending
gradient

• central SUs will tend to have

$$\varepsilon_i < 0, \quad (P_i - n^{-1}) < 0 \qquad (1)$$

• border SUs will tend to have

$$\varepsilon_i > 0, \quad (P_i - n^{-1}) > 0 \qquad (2)$$

and $E$ will tend to be highly positive.
In case of a descending gradient

• central SUs will tend to have

$$\varepsilon_i < 0, \quad (P_i - n^{-1}) > 0 \qquad (3)$$

• border SUs will tend to have

$$\varepsilon_i > 0, \quad (P_i - n^{-1}) < 0 \qquad (4)$$

and $E$ will tend to be highly negative.
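The sign behaviour of the statistic can be checked on a toy transect of five SUs. This is a minimal Python sketch: the distances to the edge are taken as given (their geometric computation from the medial axis is not shown), and the function names and numerical values are hypothetical.

```python
def edge_coefficient(distance_to_edge, max_distance):
    """epsilon = 1 - 2d/D: +1 on the edge (d = 0), -1 at the most central point (d = D)."""
    return 1.0 - 2.0 * distance_to_edge / max_distance


def edge_statistic(rates, coefficients):
    """E = sum_i epsilon_i * (P_i - 1/n)."""
    n = len(rates)
    return sum(eps * (p - 1.0 / n) for p, eps in zip(rates, coefficients))


distances = [0.0, 5.0, 10.0, 5.0, 0.0]     # hypothetical d_i, with D = 10
eps = [edge_coefficient(d, 10.0) for d in distances]

descending = [0.1, 0.2, 0.4, 0.2, 0.1]     # participation higher in the center
ascending = [0.3, 0.15, 0.1, 0.15, 0.3]    # participation higher on the edge

e_desc = edge_statistic(descending, eps)   # negative, as in Equations (3) and (4)
e_asc = edge_statistic(ascending, eps)     # positive, as in Equations (1) and (2)
```

SUs with an edge coefficient of 0 do not contribute to $E$, whatever their participation rate.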

Finally, under $H_0$ of spatial homogeneity of type I error, the sum of all $P_i$, equal to 1,
is homogeneously distributed among the $n$ SUs, with an expected participation rate equal to
$n^{-1}$. Under this null hypothesis, $(P_i - n^{-1})$ has zero expectation and is independent
of $\varepsilon_i$. Consequently, under the null hypothesis, the expected value of $E$ is zero.

Figure 4 Empirical distribution of SUs participation rates computed over 20 000 simulated
datasets for two baseline incidences of birth defects: 2.26% ($I_{all}$) and 0.48% ($I_{cv}$).

Since the variance of the $E$ statistic under $H_0$ (spatial homogeneity of type I error) is
unknown, we used Monte Carlo simulation in which the $n$ observed $P_i$ were randomly
redistributed 99 999 times among the $n$ SUs in the region. The p-value was the proportion of
elements among the collection of simulated and observed statistics which were greater than or
equal to the observed value. The resolution of this p-value was thus $10^{-5}$.
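The Monte Carlo procedure amounts to a permutation test, sketched below in Python. Two points are assumptions of this sketch rather than statements from the study: the comparison is made on absolute values of $E$ (a two-sided reading, chosen because the observed statistics reported later are negative), and the toy rates, coefficients, seed and function names are hypothetical.

```python
import random


def edge_statistic(rates, coefficients):
    """E = sum_i epsilon_i * (P_i - 1/n)."""
    n = len(rates)
    return sum(eps * (p - 1.0 / n) for p, eps in zip(rates, coefficients))


def monte_carlo_p_value(rates, coefficients, n_replicates, rng):
    """Permutation p-value for E, comparing absolute values (assumed two-sided)."""
    observed = abs(edge_statistic(rates, coefficients))
    shuffled = list(rates)
    as_extreme = 1   # the observed statistic is part of the collection
    for _ in range(n_replicates):
        rng.shuffle(shuffled)   # redistribute the observed P_i among the SUs
        if abs(edge_statistic(shuffled, coefficients)) >= observed:
            as_extreme += 1
    return as_extreme / (n_replicates + 1)


eps = [1.0, 0.0, -1.0, 0.0, 1.0]       # hypothetical edge coefficients
rates = [0.1, 0.2, 0.4, 0.2, 0.1]      # hypothetical participation rates
p_value = monte_carlo_p_value(rates, eps, 999, random.Random(42))
```

With 99 999 replicates, the smallest attainable value is 1/100 000, which matches the resolution of $10^{-5}$ stated above.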

Figure 5 Size of the at-risk population for each SU in the Auvergne region, as defined by mean
number of live births per year between 1999 and 2006 (source: INSEE). Q1: < 17; Q2: > 17 and
< 35; Q3: > 35 and < 70; Q4: > 70.

Kulldorff's spatial scan statistic

In this study, we selected Kulldorff's spatial scan statistic [24,25] as a well-known and widely
used CDT whose performance has been studied by many authors [1,6,10,26]. The spatial scan
statistic detects the most likely cluster based on locally observed statistics of likelihood
ratio tests. The scan statistic considers all possible zones $z$ defined by two parameters: a
center that is successively placed on the centroid of each SU, and a radius varying between 0
and a predefined maximum. The true geography being delineated by administrative tracts, each
zone $z$, defined by all SUs whose centroids lie within the circle, is irregularly shaped. Let
$N_z$ and $n_z$ be respectively the size of the at-risk population and the number of cases
counted in zone $z$ (over the whole region, these quantities are the total population size $N$
and the total number of cases $n$). The probabilities that an at-risk individual is a case
inside and outside zone $z$ are respectively defined by $p_z = n_z / N_z$ and
$q_z = (n - n_z)/(N - N_z)$. Given the null hypothesis of risk homogeneity $H_0: p_z = q_z$,
versus the alternative $H_1: p_z > q_z$, and assuming a Poisson distribution of cases, Kulldorff
defined the likelihood ratio statistic as proportional to

$$\left(\frac{n_z}{\lambda N_z}\right)^{n_z} \left(\frac{n - n_z}{\lambda (N - N_z)}\right)^{n - n_z} I(n_z > \lambda N_z)$$

where $\lambda$ (here equal to $I_{all}$ or $I_{cv}$ depending on the case considered) is the
global incidence and the indicator function $I$ equals 1 when the number of observed cases in
zone $z$ exceeds the expected number under $H_0$ of risk homogeneity, and 0 otherwise. The
circle yielding the highest likelihood ratio is identified as the most likely cluster. The
p-value is obtained by Monte Carlo inference.
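The likelihood ratio is usually evaluated on the log scale. The Python sketch below is an illustration only and not the `kulldorff` function of SpatialEpi: the function name, the candidate zones and the counts are hypothetical, and zones without excess cases are given a log ratio of 0, a common convention that is assumed here.

```python
import math


def poisson_log_likelihood_ratio(n_z, N_z, n, N, lam):
    """Log of the Poisson likelihood ratio for zone z, with global incidence lam."""
    expected_inside = lam * N_z
    if n_z <= expected_inside:
        return 0.0   # indicator function: only zones with excess cases are candidates
    inside = n_z * math.log(n_z / expected_inside)
    cases_outside = n - n_z
    outside = 0.0
    if cases_outside > 0:
        outside = cases_outside * math.log(cases_outside / (lam * (N - N_z)))
    return inside + outside


# Hypothetical candidate zones given as (cases n_z, at-risk population N_z).
zones = [(10, 200), (5, 200), (14, 400)]
n_total, N_total, incidence = 30, 1000, 0.03

scores = [poisson_log_likelihood_ratio(n_z, N_z, n_total, N_total, incidence)
          for n_z, N_z in zones]
most_likely_cluster = zones[scores.index(max(scores))]
```

In a full scan, the candidate zones would be all circles centered on SU centroids up to the maximum size, and the significance of the maximum would be assessed by recomputing it on Monte Carlo samples generated under $H_0$.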

Over the 40 000 simulated datasets, each test was per- 
formed with a maximum size of zone z set to 50% of the 
total at-risk population, a number of 999 Monte Carlo sam- 
ples for significance measures, and an alpha level set to 5%. 

Software 

Data simulation and analysis were performed in R 2.14.0 [3,27-29], using the function
"kulldorff" of the SpatialEpi package [27] to perform Kulldorff's spatial scan.

Results 

Overall rate and WDC characteristics 

The overall type I error rate was 5.11% (1021 WDCs over 20 000 datasets; 95% CI [4.80%, 5.42%])
for $I_{all}$ and 5.06% (1012 WDCs over 20 000 datasets; 95% CI [4.76%, 5.38%]) for $I_{cv}$.
The average size of WDCs was 21.4 SUs (minimum 1 SU, median 11 SUs, maximum 116 SUs) and
23.4 SUs (minimum 1 SU, median 11 SUs, maximum 132 SUs), respectively. Figure 2 shows the
empirical distribution of the WDC size for each baseline risk.
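The order of magnitude of these intervals can be checked with a few lines of Python. The interval method used in the study is not stated; the normal (Wald) approximation below is an assumption, and it gives bounds within a few hundredths of a percentage point of those reported. The function name `wald_interval` is hypothetical.

```python
import math


def wald_interval(successes, trials, z=1.96):
    """Proportion with its normal-approximation 95% confidence interval."""
    p = successes / trials
    half_width = z * math.sqrt(p * (1.0 - p) / trials)
    return p, p - half_width, p + half_width


# 1021 wrongly detected clusters over 20 000 datasets (all birth defects).
rate, lower, upper = wald_interval(1021, 20000)
```

An exact binomial interval would be slightly asymmetric around the estimate, which may explain the small difference in the upper bound.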

SUs participation rates 

Figure 3 shows the SUs participation rates for baseline risks $I_{all}$ (Figure 3a) and
$I_{cv}$ (Figure 3b). The expected participation rate ($n^{-1}$) for each SU is equal to 0.452%.
With 0.452% ± 0.147% (mean ± standard deviation) for $I_{all}$ and 0.452% ± 0.148% for $I_{cv}$,
the two observed distributions of participation rates were very close to each other (Figure 4).
The observed values varied from 0.097% to 0.877% for $I_{all}$ and from 0.091% to 1.03% for
$I_{cv}$.

We tested for a correlation between $P_i$ and the size of the at-risk population (Figure 5)
using Spearman's rank test. Both coefficients were negative but neither relationship was
significant ($r = -0.13$ with p-value = 0.056 for $I_{all}$ and $r = -0.11$ with p-value = 0.1
for $I_{cv}$).




Figure 6 Values of the edge coefficient $\varepsilon$ computed over a regular sampling of
500 000 points within the region.

Edge effect

Figure 6 shows the value of the edge coefficient $\varepsilon_i$ computed for a regular sampling
of 500 000 points within the region. Figure 7 shows the value of the edge coefficient computed
for the $n = 221$ SUs within the region.

With $E$ equal to -0.086 for $I_{all}$ and -0.074 for $I_{cv}$, both simulations resulted in a
descending gradient of $P_i$, i.e. higher $P_i$ for central SUs. As shown by the $E$ values,
this gradient was stronger for $I_{all}$ than for $I_{cv}$.

As shown in Figure 8, the SUs contributing to the overall type I error for more than $n^{-1}$
($P_i > n^{-1}$) were mostly located away from the border of the region. The black line
delineates a central zone where the edge coefficient is negative and a complementary zone where
the edge coefficient is positive. Within the central zone, red SUs contribute negatively to $E$
(see Equation 3). On the contrary, outside the central zone, red SUs contribute positively to
$E$ (see Equation 2).

Both tests were highly significant, with Monte Carlo p-values both equal to $10^{-5}$
(99 999 replicates). Figure 9 shows the simulated null distributions of $E$ and the observed
values for the two simulated baseline risks.



Discussion 

We have shown that type I error is heterogeneously dis- 
tributed with a descending gradient from center to edge. 
Even if global type I error is very near the predefined 5%, 
WDCs are rarely located on the edge of the map. In a 
survey system, where sensitivity matters over specificity, 
it could be argued that since global type I error is pre- 
served, the global cost in unfruitful secondary investiga- 
tion is not affected by the spatialization of type I error. 

Our work did not aim to test for clustering in type I error rate and thus we did not use CDTs
to analyze the spatial distribution of $P_i$. We note, however, that methods such as Bayesian
smoothing could be of interest in the description of the spatial distribution of type I error.
As the presence of an edge effect with descending gradient was obviously expected, our
contribution aimed to describe, quantify and test for this edge effect. Furthermore, within a
given region, the spatial description of type I error makes it possible to see precisely which
detected clusters should be carefully considered because they are less likely to coincide with
a false alarm.



Figure 7 Values of the edge coefficient $\varepsilon$ computed for each SU within the region.
Each SU is assigned the value of the edge coefficient $\varepsilon_i$ computed for its centroid.

Figure 8 Overlaying of the sign of the difference between observed and expected values of SUs
participation rates (computed over 20 000 simulated datasets for each map) and the sign of the
edge coefficient $\varepsilon$ ($\varepsilon$ negative in hatched area). (a) Baseline incidence
of birth defects set to 2.26%. (b) Baseline incidence of birth defects set to 0.48%.

Figure 9 Histograms of the null distribution (99 999 replicates) and observed values of $E$ for
two baseline incidences of birth defects: 2.26% ($I_{all}$) and 0.48% ($I_{cv}$).

The edge effect was present and strong, regardless of the baseline risk. Only two levels of
this risk have been tested. One could wonder about a possible correlation between edge effect
and the level of baseline risk. Levels at regular intervals between these two baseline risks
are currently being explored and there is no evidence of such a correlation so far (data not
shown).

The edge effect is indisputable in this study (Figure 8) and the statistic $E$ has consequently
resulted in a highly significant test. This statistic is based on the edge coefficient
$\varepsilon_i$, which defines what is "on the edge" of the map and what is not. By using the
medial axis, we proposed a distance-based definition, but other parameters could be considered.
For instance, it could be useful to distinguish between two SUs at the same distance from the
edge but in different configurations, with one in a "peninsula" (between two edges) and thus
more isolated than the other. To be accounted for, this factor needs geometrical tools to
characterize spatial isolation.

Aside from a purely geometrical definition of what is an edge, confounding factors should also
be taken into account. Suppose that the at-risk population is heterogeneously distributed, with
more populated areas centrally localized, and suppose further that the at-risk population size
is negatively correlated with the participation rate (this was not the case in our study). Our
test for edge effect might then turn out to be significant, concluding in an ascending gradient
of $P_i$ from center to edge, only because of this confounding factor. In our simulations, the
at-risk population is indeed more centrally localized. If the negative correlation between
population size and $P_i$ had been significant, we would have had even stronger evidence for a
descending edge effect in $P_i$ from center to edge, because our results, which turned out to
be significant, would actually have been underestimated.

Even if we did not find any relationship between population size and participation rate, other
factors (such as the number of neighbors, the accessibility by road or rail system, etc.)
should be evaluated. The best way to deal with these confounding factors might be to integrate
them in the construction of $\varepsilon_i$ for geographical factors, or to replace the
constant $n^{-1}$ by a vector of expected participation rates for epidemiological factors. For
the expected value of the $E$ statistic to remain equal to 0 under $H_0$ (spatial homogeneity
of type I error), this last adaptation should be done in such a way that the sum of all
expected participation rates stays equal to 1.

Our results highlight the edge effect in type I error, and thus can help the interpretation of
real data analysis. It could be even more useful to provide a way to integrate spatial
heterogeneity of type I error in the analysis itself. However, CDT behavior should be adjusted
to address this issue only if doing so does not impede the tests' power. In a previous
simulation study on CDT performance, we proposed a method to build performance maps based on a
systematic spatial evaluation [15]. The data now available for both $H_1$ (single clusters of
4 SUs in this previous study) and $H_0$ (risk homogeneity) in similar settings (same baseline
risk and population size) will enable us to study whether and how a spatial adjustment of
type I error could be beneficial.

Conclusion 

Spatial heterogeneity of type I error should be consid- 
ered when interpreting analysis of real data, because of 
the strong edge effect. This work clearly shows that a 
detected cluster on the edge of the region of interest 
is less common when no alarm should be raised. To 
explore all avenues, assessment of edge effect and its 
factors, as well as development of tools to integrate it in 
routine health survey, should be considered. 

Endnotes 

Computation of the straight skeleton was performed using [30] and the results were imported
and displayed with JTS Topology Suite [31], software under a GNU license.